tele9752wikiaorg-20200213-history
Advanced topic 4, 2012
Opportunistic Flow-Level Latency Estimation Using Consistent NetFlowM. Lee, N.G. Duffield, and R.R. Kompella, "Opportunistic Flow-Level Latency Estimation Using Consistent NetFlow", ;presented at IEEE/ACM Trans. Netw., 2012, pp.139-152. M. Lee, N.G. Duffield, and R.R. Kompella, "Opportunistic Flow-Level Latency Estimation Using Consistent NetFlow", ;presented at IEEE/ACM Trans. Netw., 2012, pp.139-152 The content of this wiki page was adapted from the paper above. All credit goes to original authors. (Topic 4 from Advanced Topics 2012) 'INTRODUCTION' After reading this paper the reader should be able to understand the following concepts: *What is latency and why is it important? *What are flow records and how are they useful? *What is consistent NetFlow? *What is Opportunistic Latency Estimation? *What are the experimental results of this study? What is latency and why is it important? Latency is the delay in data propagation introduced by links and network devices. Bandwidth measures how much data can be shipped in a second, while latency measures how many milliseconds it takes a single bit or packet of data to get to its destination. In modern networks bandwidth bottlenecks can be resolved by the addition of extra links. Latency problems are much harder to resolve. Causes of network latency arehttp://en.wikipedia.org/wiki/Latency_vs_Bandwidth *The speed of light - 3 x 10^8 m/s - is the lower limit on propagation of all electromagnetic signals. In data networks the fastest that can be achieved is about half this. This latency is proportional to the length of the links between network devices *Switching and processing latency is the delay introduced by devices in the data path, and ranges from 1 millisecond for a fast switch up to 10 milliseconds for a slower firewall. Analog modems would introduce up to 50 milliseconds latency at each end of a link. In general the more CPU processing (firewalls, load balancers) the more latency is introduced. Less latency is introduced by Layer 2 switching than layer 3 routing The combined effects of link and device latency result in the latency percieved by the end user. Typical latencies seen on a broadband packet network are: #Within Sydney 10 milliseconds #Within Australia - 30 - 100 milliseconds #Australia to the US - 200 milliseconds What are flow records and how are they useful? Routers take a packet from the ingress port, look up a table to determine where to send it, swap out the destination mac address and forward the packet out the egress port indicated by the routing table. NetFlowhttp://en.wikipedia.org/wiki/NetFlow was originally a routing technology developed by Cisco around 1990. Prior to NetFlow Cisco routers sent all packets to the router CPU to determine the next hop. NetFlow cached the results of the first lookup in a cache table. Future lookups then matched the destination against the cache reducing the table lookup time. The NetFlow cache also contained the source IP, source port and destination port so that if the packet had some rules applied, for example ACL filtering, and a flow was created matching these parameters, further packets would not require a CPU lookup if they matched the cached flow. When new hardware was introduced that offloaded routing decisions to specialised chips on the router, the NetFlow cache was no longer required for routing, but the records were still in the router. It was discovered that these records contained useful information about the network and avoided the need to implement more complex RMON strategies to get Layer 3 information from network devices for network management purposes. NetFlow information may contain: *Flow usage counters *Start time and stop time of the flow *Interfaces used *QoS flags *IP addresses *Application ports *Routing information NetFlow has been through multiple iterations and is now up to Version 9 which uses templates to determine the format of records which contain up to 87 fields. Due to the size and volume of NetFlow records usually only a sample of the records is exported to a collector which stores the data for analysis. Vendors other than Cisco use similar but incompatible flow recording mechanisms. A standardisation effort is underway at the IETF for the method of exporting flow records (IPFIX)http://en.wikipedia.org/wiki/IPFIX and the method of sampling flow records (PSAMP)http://www.ietf.org/rfc/rfc5473.txt. What is consistent NetFlow? Given that we have collected all these flow records from multiple routers across the network, the problem that arises is that if a flow is an entity independent of the router, how do we match flow records from multiple routers to a single flow? This problem is more intractable than it first seems. Problems associated with matching flows across routers include *Time synchronisation of routers may not be accurate so time stamps don't match *Packet loss *Cache expiry due to load *Different sampling across routers - may be random In 2000 to 2002 trajectory sampling N. G. Duffield and M. Grossglauser, “Trajectory sampling for direct traffic observation,” IEEE/ACM Trans. Netw., vol. 9, no. 3, pp. 280–292, Jun. 2000 was suggested which proposed sending a packet with a special hash payload across the network that could easily be identified. By measuring the arrival time of this packet at each hop, the network latency could be inferred. In 2009 this idea was extended to IETF PSAMP in the following way: *Sample packets at each link *Pseudo random sampling (e.g., 1-out-of-100) *Compute a hash over the invariant fields (same on each hop) of the packet *Packet is selected for reporting (export) if the hash falls within a given range If all routers use the same hash, same input fields and same selection range, the flows that are exported will match between routers, with the result being consistent selection. What is Opportunistic Latency Estimation? Invariant fields in a flow include source and destination IP and source and destination port. These fields define the flow. In hash-based flow sampling the hash is computed over these fields. Including the packet payload in the hash function produces a hash-based packet sample. Packet sampling has some advantages over flow sampling: *Flow sampling can be bursty and uneven due to CPU processing cost associated with processing the flow, while packet sampling smoothes the selection process *Packet sampling is more compatible with traditional approaches to network management tasks which often require analysis of individual packets *Packet sampling preferentially selects large flows which are more often the subject of measurement than small flows Only completed flows have time stamps, one at the beginning of the first packet in the flow and one at the end of the last packet. Some method is needed to estimate the latency of packets within the flow to facilitate packet sampling. Opportunistic estimation makes use of the statistics of smaller flows that occur at the same time as the larger flow being investigated to estimate the start and finish times for individual packets within the flow. Some examples: *the packet delay of a flow of a single packet is the same as the flow delay *the packet delay of a flow of only two packets can be estimated as the delay of the flow divided by two - this is called the end-point estimator *for flows consisting of large numbers of packets, the delay of each packet can be estimated by correlating with the time stamps of smaller flows that occur at the same time as the larger flow - this is called the multiflow estimator. '' What are the experimental results of this study? In this paper, applying a threshold to the number of packets and using the end-point estimator for smaller flows and the multiflow estimator for larger flows is proposed as a heuristic - this is called the ''hybrid estimator. In addition to using intermediate flow delays to estimate packet delays, linear interpolation is used to construct an estimate for delay times for any packet in the flow under study. In this study the interpolated delays are estimated and compared to the actual packet delays to estimate the accuracy of the heuristic. 'RELATED WORK' Compared with existed technique, there are two mainly improvement: 1.It could provide per-flow latency measurement, however onther techniques could only provide per-hope lantancy statistic. 2.It could achieve passive measurement which means no active probes. This technique borrowed some techniques from other domains: *Consistent hashing from trajectory sampling proposed by Duffield in order to ensuring that consistent streams of packets are observed at two routers. *Directly comparing trajectory samples of the same packet observed at different points in order to measure loss and delay passively. *Packet Doppler is a technique to observe how packets that arrived at a router ina window interval get distributed in a discrete time series ofthe same window interval at the other router. * *Lossy difference aggregator(LDA) which can be implemented in routers at high speedsfor measuring various latency characteristics with microsecond precision. 'FLOW COLLECTION ARCHITECTURE' There are two delay estimation methods. In the Endpoint method,we use only delay of the first and last flow packets to estimatethe average delay of packets in a flow. In the Multiflow method,we also use delays of the first and last packet of other flows tosupplement the delay average. A Hybrid estimator combines the best performance of each. 'A. Current NetFlow Architecture' This figure is the overview of the Consistent NetFlow architecture. There are three routers R1,R2 and R3. Netflow instances labeled A and D are in the ingress direction s of two interfaces of router R1. Flows in the egress directions of these interfaces of R1 will be captured at the ingress direction of routers R2 and R3. The black squares in this figure means flow cache, it is used to keep flow record. Each flow record is created by aggregating all packets that share some common fields in the header, called the flow key. 'B. Prerequisites' There are two two basic assumptions the approach relies on: *Accurate time synchronization: The fundamental requirement for any architecture that wishes to enable accurate one-way delay measurements. *Packet Forwarding Order: The stream of packets follows a serial order (FIFO) between the monitoring points. 'C. Consistent NetFlow' *Approach to Measure Dalay Delay characteristics of the forwarding path between two interfaces within a router, or a link across routers which means from the egress interface of one router to the ingress of the other will be measured. The sum of both of these delays will also be measured since NetFlow is turned on in the ingress direction. For instance, the delay between NetFlow instances A and B in the above figure will be measured. Forwrding path latencis in R1 from input port A to output port connected to R3, and the link delay between R1 and R3 will be essentially measured. We assume that the timestamps of the first and last packets for the set of sampled flows have been stored in the NetFlow instances A and B. The key point is that if both A and B sample the same first and last packets for some flow recorded by both (in their set of sampled flows), then we can compute the delay bserved by these packets from the timestamps recorded at both A and B. *Problem Both A and B sample packets are completely independently and the number of commonly recorded flows is small. and thus the first and last packets may be different. As a result, the timestamps recorded at both ends are not consistent and hence the delays which are calculated according to these data may be wrong. *Solution In order to address this poblem, CNF(Consistent NetFlow) is proposed, which means NetFlow equiped with the additional modification that routers use hash-based sampling. Consequently, in CNF, the same packet set will be selectd independently an the timestamps are consisitent as well. *Sampling Mechanisms #flow sampling : according to the size of the selection range divided by the size of the set of of possible hash values, the average sampling rate can be determined.An input that uses only invariant fields from the flow key would result in selecton of all packets in a subset of flows. #packet sampling: Including hash input from the packet payload as well would result in selection that looks statistically similar to uniform random sampling, except that packets are selected consistently between different routers. 'D. Flow Correlation' A centralized flow collector will collect the flow records transmitted by the NetFlow process on each routers.But flow records obtained at any downstream NetFlow instance may contain flows that would have potentially taken different paths at the upstream router.Thus, some measures should be taken to obtain the properties of the path between a pair of NetFlow instances. *First Step In order to obtain the intersection set of the flow records obtained from the endpoints.Through the flow keys present in the flow records, the base set of flow records can be easily obtained. *Second Step After sorting records at each instance using the start timestamp, pairwise association can be performed since each flow records has a one-to-one correspondence. *Problems In practice, the theory results can not be achieved for several reasons as follows: #packet loss; #differences in application of flow timeouts due to slight difference in interpacket times; #different cache expiry events due to heavy loads; #uncertainties in timestamps due to propagation delay and clock synchronization issues; *Solutions #Mapping Packet Label to a Timestamp: the NetFlow is required to report a label of the first and last packet in addition to timestamp.In this way, the records can be associate not only through the timestamps but the labels. #Timing Checks to Eliminate Inconsistencies:without requiring changes to the current NetFlow architecture, this solution only utilizes timestamps to identify inconsistencies. Explanation of this solution: The sender exports a set of flow records i with fields （τsi,1，τsi,2，nsi，bsi）being the initial and final packet timestamps, number of packets, and bytes, respectively. The receiver-side flows have orresponding fields（τri,1，τri,2，nri，bri）note the same index on sender and receiver side is not assumed to riginate in the same set of packets. We now consider four different quantities e1+ e1- e2 e3 that represent timing uncertainty. *The propagation delay lies in the interval e1- *The packet processing latency is less than e2 *The clock uncertainty is less than e3 Let TinandTact denote the inactive and active flow timeouts,respectively.Criteria (C1)–(C5) applied to matching sender flow and receiver flow . (C1) τri,1 τsi,1 +[ e1-- e3 , e1++ e2+ e3] and τri,2 τsi,2 +[ e1-- e3 , e1++ e2+ e3] This condition says the first and last packet times of receiver flow are compatible with first and last packet times of sender flow , given the uncertainties of propagation, queuing, and clock synchronization. (C2) nsj= nri and bsj= bri Equality of bytes and packets is a necessary condition for the flows to be reporting on the same set of packets. (C3) τri,2-e1-+e3<τsj,1 +Tact .The upper bound for latest possible compatible sending time of the last flow packet seen at the receiver τri,2-e1-+e3 should fall within the active timeout since the first sender packet. (C4) τri,2-e1-+e3 < τsj,2 +Tin This condition is similar to (C3), except that it rules out sender packets being excluded from the flow due to inactive timeout since the last sender packet. (C5)If a pair of sender and receiver flows (of a particular key) has been discarded, discard all subsequent sender and receiver flows (of the same key) until the separation between successive flows exceedsTin on both sender and receiver sides. This is used because after discard of a pair of flows for violating any of conditions (C2)–(C4), the first times of subsequent flows may refer to different packets on the sender and receiver sides. The condition ensures discard of such “out of sync” flows. 'E. Foundations of Delay Correlation' Once pairs of flow records are associated, we have sender and receiver timestamps from both its first and last packets.We discuss the key foundation of our approach to estimate per-flow latency characteristics of the particular segment using these packet timestamps . In the simplest cases, the lag n autocorrelation of the packet queuing delay, rn ,obeys rn = 1-n(1-p)2 +O(1-p)3 (1) This formulashows that under heavy traffic.noticeable correlations persist at lag n for n up to 1/(1-p)2 In general,rn is a convex function and hence the correlations in practice decay more slowly than the dominant linear behavior in n of (1) and Internet traffic exhibits burstier arrivals than the simple models. 'F. Latency Estimation' There are three ways that we can use to estimate the latency of the flow. 1. Endpoint Estimator *Estimate the latency of the flow by measuring the latency of the first and last packets of the flow d1 and d2. *Average the delays of the first and last packets of the flow, the formula is shown in figure 1. *This approach has limited accuracy in general, but if the flow is very small, it could be more accuracy. 2. Multiflow Estimator *Average the delay over all packets with sender timestamps between the sender timestamps of the flow. the formula is shown in figure 2. *In figure 2, the ΔMF is the set of packet delays associated with the flow and δ is a delay sample in the set ΔMF. 3. Hybrid Estimator *Hybrid Estimator method tries to blend the best of the Endpoint and Multiflow methods. *By doing Hybrid Estimator method, we must use the Endpoint method for smaller flows and the Multiflow method for larger flows. 'G. Variance and Its Estimation' 1.Problem *Given the expected correlations between delay measurement, how well is the variability of a single measurement predicted? *How does the variability of set of samples relate to that of the packetss in the flow under study itself? *The mean of te statistic of the delay we measured is not enough to well-describe the distribution of the delay. 2. Variance is required for describing the statistic of the delay *Variance can tell us more about the range of the change. *The unbiased estimate of the variance is shown in figure 3. In this formula, D is the sample mean and Δ is the set of samples. *For the Endpoint estimator, the quantity σ² is expected to be extremely noisy, being based on only two data points. 'H. Interpolation of Packet Delays' 1. Problem *If we want to use the endpoint packet delay to estimate the delay of arbitrary packet, the correlation between two packets should be strong enough. *It is hard to construct an setimator in this way since we don't know the sender time of arbitrary packet. 2. Linear interpolation between endpoint delays *Linear interpolation can construct a delay value for any time, we compare it to packet delays of arbitrary flows. *For any time between Τ1 and Τm, let i+® and i-® label the closest delay values in the future and past. The formula is shown in Figure 4. *After that, we can calculate the interpolated delay. The formula is shown in Figure 5. *We can see from the Figure 5 that the delay we predict is linear to the delay we already known. 'EVALUATION' We evaluate the efficacy of Multiflow estimator using two data sets *'real backbone packet header traces with different synthetic queueing models' *'real router traces with synthetic workloads' 'A. Experimental Setup' #'packet header trace:' #*'First data set is two real backbone packet header traces' #*'Second data set is collected in an evaluation of new active probing tool' #'flow collection tool:' #*'Use an open-source NetFlow platform and modify it to support hash-based packet sampling and flow sampling.' #*'In this case, we not only apply it to fist and last packets, we can also apply the condition for every packet as we want to know delays of all packets for the evaluation.' #'Delay models:' #*'Weibull distribution: we model packet delays in real backbone rounters and model packet losses as a uniform distribution.' #*'Queueing models (Droptail queue and RED queue): we control the packet delays by configuring queue length and drain rate. ' 'B. Associating Flow Records' ' ' *'This table shows the proportion of flows surviving after filtering using the timing checks and all the results are cumulative.' *'The last column represents the percentage of flows obtained using label-based consistency checks and we can notice that it related to packet loss' *'The most important step is C1, after C1 we already filter most of the inconsistencies out.' ' 'C. Estimator Accuracy' *Comparison to Active Probes This figure shows the comparison results between Multiflow estimator and active probes and we know from it that our Multiflow estimator can keep track of variability in per-flow latencies better than active probes *'Accuracy Depending on Delay Models and Sampling Methods' ' *'Accuracy With Respect to Flow Duration' *'Comparison to Interpolation and Trajectory Sampling' 'D. Sampling and Loss Rate Variation' : To manage the effective number of sampled packets, two parametres are necessary: packet sampling rate and loss rate. As works down in previous pages shows that Multiflow estimation perform much better than Endpoint does. So in sampling and loss rate variation, we only use Multiflow estimator. *'Impact of sampling rate' :: ' '''To show how sampling rate will influence the performance of estimation accuracy, we still use CHIC and IPLS traces which use RED queue and vary packet sampling rate from 0.0001 to 0.1 As shown left, relative errors decrease with the packet sampling rate increasing and for the IPLS trace, it remains stable after a sampling rate of 0.001 But for small flow(of size ≤100), the relative errors will also increase. This is because estimated delays now are aggregated over a larger number of samples, which benefits larger flows, but reduces smaller flows’accuracy. *'Impact of Packet Loss Rate ' :: ' To show the effect of packet loss rate, we changing the drain rate to set the loss rate from 0% to 5% for both CHIC and IPLS traces. we set 5% loss rate to test how the estimator works under such high rates but normally 1% loss rate is the acceptable maximun loss rate.Contrary to intuition, relative error reduces as increasing the packet loss rate. This is because in RED or Droptail queues, high loss rate implies that the queue is almost always full. Thus, the delay distribution is a lot more stable and hence easier to predict even with smaller number of samples. :: We also use real router traces with synthetic workloads—WISC traces to estimate the accuracy under packet loss rate. We use three traces WISC-R1, -R2 and -R3 have small (0.01%), medium (0.12%), and high (4.59%) packet loss rates, respectively.As shown left, relative error also reduces as increasing the packet loss rate. Specifically, Multiflow estimator achieves a median relative error of 0.83% by WISC-R3, 8.20% by WISC-R2, and 13.75% by WISC-R1. '''E. Accuracy of Standard Deviation Estimates :: The increasing of packet loss rate leads the decreasing of relative error of standard deviation of flow-level latency. As shown left (a is for Multiflow estimator and b is for Endpoint estimator), both CHIC and IPLS traces using Multiflow and Endpoint estimation show the same trend. But the estimation of Endpoint cannot be trusted as Multiflow because of its poor accuracy.Specifically, the median relative errors of Endpoint are over 80% for IPLS and 50% for CHIC. Even in the case of high packet loss rate (say, 4%–5%), Endpoint still cannot be used due to so much variation in estimation accuracy. At around 4%–5% packet loss rate, the relative error of Multiflow on CHIC increases slightly. This is because as link utilization become higher the variance of packet delay becomes very small, and the standard deviation of packet latencies of a flow becomes smaller compared to the case of low link utilization. Since relative error is prone to be more erroneous when the small value needs to be estimated, the relative error of the standard deviation becomes large. :: We still use WISC traces which have same settings as used in part D to check the accuracy again. As shown left, the increase of the packet loss rate also reduces the relative error of standard deviation of flow-level latency. But the improvement in accuracy of standard deviation estimates among traces is less than that in mean estimation accuracy. 'CONCLUSION' In conclusion, the main points made in this paper are: *Problem being solved *Proposal *Experimental evaluation *Results *Criticisms Problem being solved NetFlow provides valuable information about network traffic for network management. It also provides time stamps at the start and end of every flow, and these time stamps should be able to be used to as data for measurement for network latency. Currently NetFlow is used only in the context of individual routers because flow records are not correlated between routers. Flow records cannot be used for network latency measurements as the scope of each record only spans the individual router. Furthermore NetFlow is Cisco proprietary and not universally supported among equipment vendors. Hash-based sampling has been proposed as part of the IETF PSAMP standard that will allow correlation of flow records and standardise these across all vendors. Once hash-based sampling is available for flow records these records can be correlated among many routers network latency measurements can finally be gathered passively using NetFlow by comparing time-stamps from different routers for individual flows. In addition to measuring latency per flow, it would be useful to obtain an estimate of latency for each packet within the flow. This paper attempts to extend the use of flow time-stamps by using the time-stamps of background flows to estimate the per-flow average and standard deviation of delays within the flow under study. Proposal The authors propose a heuristic approach to estimating delays by recording the time-stamps on shorter background flows that occur within the time span of the longer flow under study. This approach involves using a threshold to determine which estimator to use (end-point or multiflow) using linear interpolation to estimate the average flow delay Experimental evaluation The raw data is in the form of packets and the experimental problem is to measure the accuracy of flow time stamps when the only known timestamps are those of the packets. The experiment could therefore be done in two ways - convert the flow time stamps to packet time stamps and compare the results, or convert the packet time stamps to flow time stamps and compare. The authors chose the first option leaving second option an opportunity for further study. The experiment applied various traffic distribution models (queueing models - Droptail queue and RED queue, and Wiebull distribution) to create actual timestamps on the packets at different points in the network. Then a series of estimators were used to estimate the packet time stamps from the Flow time stamps. At this point the authors could compare the estimated packet time stamps with the actual packet time stamps to determine the acuracy of the experimental technique and also of the original proposition - that the use of hash based flow sampling would allow accurate correlation of Netflow records between routers. Two data sets are used - one from the Internet backbone and one synthetic set. The flow collection tool (YAF) is modified to support hash-based packet sampling and hash-based flow sampling. The backbone data set already has real delays added, so the synthetic data set is fed through a real router to create delays using RED and Droptail queuing mechanisms, and a Weibull theoretical delay distribution is also used. To evaluate the heuristics the estimated latencies are compared to the actual latency of the packets which are known from the data. Association of packet flows using timing constraints is compared to association using hash-based packet labeling. Accuracy is measured by plotting the CDF against the relative error of delay estimates for the multiflow estimator and active probes and the multiflow estimator and the endpoint estimator. Multiflow is compared to endpoint for different flow sizes and durations. The comparison to Interpolation and Trajectory Sampling was interesting. In the trajectory sampling approach packets are consistently hash-sampled across routers, and reports on the sampled packet include a (distinct) hash label with which to associate reports on the same packet. The association from label to flow is accomplished using an augmented report that includes both, which, to save reporting bandwidth, needs to be reported only once in the network, e.g., at an ingress router. To obtain per-flow latency estimates in Trajectory sampling, they group together packet samples associated with the flow key and compute the packet average latency. Results Estimator accuracy The multiflow estimator was more accurate than either the endpoint estimator or active probes for packet sampling. For flow sampling the endpoint estimator is more accurate. Over a range of flow sizes, endpoint performs better up to about size 3-4 and then accuracy decreases. The flow sampling above contained a large number of small flows which was why the endpoint estimator was more accurate. Criticism of the paper Overall this paper follows the following path: *We have NetFlow records on routers and they are useful for network management *We still do not use the time-stamps in NetFlow records to their best advantage because NetFlow records are not correlated between routers *IETF PSAMP will standardise hash-based sampling, so when they do and it becomes commercially available then correlation of NF records will be possible network-wide. *Metrics can be developed that utilize background flow delays to estimate per-flow average delays and standard deviations Examining each of these statements critically: *NF records are useful for network management, but problems that are not addressed here are **NF is resource intensive **Resources used by NF could be needed for data traffic *Some network management systems (Riverbed, Tenable) correlate NF records, however not in the deterministic manner as proposed here. *Given that this approach still relies on sampling, NF will still not replace Wireshark and network sniffing in the network management tool for packet capturing, and SNMP will still be used for lower level utilization reporting. NF sits midway between the two. *PSAMP is not commercially available yet (to my knowledge), so this approach is still evolving *The utility of per-flow average delays and standard deviations is not clear and may not be known until it becomes commercially available (if ever). 'REFERENCES'